Advanced Python Programming¶

Tomas Kontrimas, Elena Manao, Dr. Martin Wolf

Advanced Python Programming, WS 2023/24

In [3]:
class Word:
    def __init__(self, word):
        self.word = word
    def __add__(self, rhs):
        return Word(self.word + ' ' + str(rhs))
    def __str__(self):
        return self.word

hello = Word('Hello')
world = Word('World')
print(hello + world)
Hello World

Learning objectives of this course¶

  • Understanding the concepts of object-oriented programming (OOP), in particular in Python
  • Understanding inheritance in OOP
  • Some useful software packages for data processing and plotting
  • Working with version-control systems like git and github.com

Technical Prerequisites Of This Course¶

  • Computer with Python and Jupyter Notebook support

    • Linux, Windows, Mac
  • Alternatively, create and use Jupyter notebooks on Google Colab: https://colab.research.google.com

  • Microsoft Visual Studio Code is an integrated development environment (IDE) that supports Python and Jupyter notebooks as well.

  • github.com account

Why do we need to be able to develop software in physics?¶

  • For most of our professional time we work with ...

    • data taking
    • data simulation
    • data management
    • data selection
    • data analysis
  • All of the above needs software that is ...

    • efficient
    • modular
    • easy to maintain
    • scalable

Example: Trigger rate of the IceCube Neutrino Detector at South Pole:

[Figure: Trigger rate of the IceCube Neutrino Detector at South Pole]

Overview¶

DON'T PANIC!

Just read the instructions: https://docs.python.org/3/

Google is your friend!

Intro and course setup¶

  • Jupyter Notebook and virtual Python environments on your own computer
  • Jupyter Notebook on Google Colab

The basics of Python¶

  • Data types in Python
  • Python as a calculator
  • String formatting
  • Loops and Conditions
  • List comprehension
  • Working with files
  • Functions
  • Lambda functions

Advanced basics of Python¶

  • Exceptions
  • Decorators
  • Nested functions
  • PEP8 Coding Style
  • Common pitfalls

Advanced packages¶

  • numpy for data processing
  • pandas for structural data processing
  • scipy for scientific calculations
  • matplotlib for plotting

Object-Oriented Programming (OOP)¶

  • Classes
  • Properties
  • Class methods
  • Static methods
  • Operator methods
  • Inheritance
  • Multiple inheritance
  • Abstract base classes
  • Iterator protocol
  • Context Manager
  • Generators

Working with version control systems (git)¶

  • What is a version control system?
  • What is git?
  • Cloning a repository
  • Working with branches
  • Committing code changes
  • Merging two branches
  • Resolving merge conflicts
  • Pull requests on github.com

Jupyter Notebook and virtual Python environments on your own computer¶

This section shows how to install Jupyter and Python virtual environments on Linux (Ubuntu).

First, let's install Jupyter via apt-get:

sudo apt-get install python3-notebook python3-ipykernel

For this course, I recommend setting up a dedicated virtual Python 3 environment:

sudo apt-get install virtualenvwrapper
source /usr/share/virtualenvwrapper/virtualenvwrapper.sh

The environment is created via:

mkvirtualenv -p /usr/bin/python3 --system-site-packages basic-py3-course
pip install -U astropy ipympl matplotlib numpy scipy

The final step is to add the virtual Python environment to Jupyter Notebook:

python -m ipykernel install --user \
    --name basic-py3-course --display-name "Python 3 (basic-py3-course)"

I further recommend putting the

source /usr/share/virtualenvwrapper/virtualenvwrapper.sh

line into your .profile and launch a new terminal window.

You can start a Jupyter Notebook via:

jupyter notebook

and create a new basic-py3-course notebook.

For executing Python scripts, you can activate the basic-py3-course environment via:

workon basic-py3-course

and deactivate it via:

deactivate

Jupyter Notebook on Google Colab¶

  • Go to https://colab.research.google.com and create a new notebook
  • To install new packages, type !pip install <package>

The basics of Python¶

  • Python is an interpreted programming language (compilation into byte-code before interpretation).
  • Python is a dynamically typed programming language: names are not bound to a fixed type, but every object has one.
  • The latest major version of Python is Python 3. Don't use Python 2 anymore!

The Zen of Python (PEP 20)¶

In [1]:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
In [5]:
# Comments start with '#'.
In [6]:
# Import modules.
import os

import matplotlib.pyplot
import numpy as np
In [7]:
# Import only specific functions from a module.
from scipy import interpolate, stats
In [8]:
# Remember: never ever do 'from foo import *'
In [9]:
# Declare a variable.
a = 5.
a = "Hello World!"

Data types¶

  • Numeric types:
    int, float, complex
  • Sequence types:
    str, list, tuple, range, bytes, bytearray, memoryview
  • Set types:
    set, frozenset
  • Map types:
    dict
  • Boolean types: bool
  • The None type object: None
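  • Assignment and argument passing never copy an object; names are bound to objects. Mutable objects like list and dict can be changed in place, and such a change is visible through every name bound to the object.
  • Immutable objects like int and float cannot be changed in place. Operations on them create new objects, so they behave as if they were copied by value.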
  • Full list:
    https://docs.python.org/3.7/library/stdtypes.html#
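
The difference between mutable and immutable types shows up as soon as two names refer to the same object. The following is a minimal sketch; all variable names are illustrative only:

```python
import copy

# Two names bound to the same mutable list object.
a = [1, 2, 3]
b = a
b.append(4)
print(a)        # The change is visible through both names.
print(a is b)   # Both names refer to the identical object.

# An int is immutable: `y += 1` binds y to a new object, x is unchanged.
x = 5
y = x
y += 1
print(x, y)

# An explicit copy is needed to get an independent list.
c = copy.copy(a)
c.append(5)
print(a, c)
```

For nested containers, copy.deepcopy also copies the contained objects.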

Python as a calculator¶

In [10]:
# Addition and subtraction
print("5.  + 5. =", 5. + 5.)
print("10. - 5. =", 10. - 5.)
5.  + 5. = 10.0
10. - 5. = 5.0
In [1]:
# Multiplication, division, and floor division
print("5.  * 5. =", 5. * 5.)
print("25. / 5. =", 25. / 5.)
print("5.3 // 2 =", 5.3 // 2)
5.  * 5. = 25.0
25. / 5. = 5.0
5.3 // 2 = 2.0
In [12]:
# Modulo and exponentiation
print("25. % 4. =", 25. % 4.)
print("5.**2    =", 5.**2)
25. % 4. = 1.0
5.**2    = 25.0
In [13]:
# Do calculations with variables.
a = 5.
print("a + 5. =", a + 5.)
a + 5. = 10.0
In [14]:
b = 4.
c = a + b
print("c = a + b =", c)
c = a + b = 9.0
In [15]:
# In-place modifications
a += 5.
print("a =", a)
a = 10.0
In [16]:
# Call mathematical functions like sin, exp, log, ...
# Instead of NumPy, you can also use the 'math' module.
print("exp(2.5)   =", np.exp(2.5))
print("sin(pi/2.) =", np.sin(np.pi/2.))
exp(2.5)   = 12.182493960703473
sin(pi/2.) = 1.0

String formatting¶

General form:

[fill][align][sign][#][0][width][,][.precision][type]

See:
https://docs.python.org/3/library/string.html#format-string-syntax

In [17]:
# Some examples
print("{:>+10.2f}".format(np.pi))
print("{:010d}".format(3))
     +3.14
0000000003
In [18]:
# Some more examples
s = "{0}, {1[0]}, {1[1]}, {2[a]}"
print(s.format(0, [1, 2], {"a": 3}))
print("{key}".format(key=4))
0, 1, 2, 3
4

F-strings (formatted string literals), introduced in Python 3.6, support the same format specification as .format(). They can embed Python expressions directly, using the f"string example {expression:format}" syntax.

In [19]:
# f-string examples
a = 10.5
print(f"a = {a}")
print(f"a = {a:.0f}")
print(f"Evaluate some code: {5*5:.2f}")
print(f"Evaluate some more code: {np.sqrt(9):.0f}")
a = 10.5
a = 10
Evaluate some code: 25.00
Evaluate some more code: 3

Loops and Conditions¶

In [20]:
# Example for-loop with conditions
for i in range(3):
    if i == 0:
        print("Skip 0.")
    elif i % 2 > 0:
        print("{} is an odd  number.".format(i))
    else:
        print("{} is an even number.".format(i))
Skip 0.
1 is an odd  number.
2 is an even number.
In [21]:
# Example while-loop
d = {"a": 0, "b": 1}

while len(d) > 0:
    d.popitem()
  • You can break out of a loop via break.
  • Iterations can be skipped via continue.
  • Remember that Python has no block scope: names assigned inside a loop or an if-block, including the loop variable, remain defined after the block.
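
A short sketch of the three points above. The names odd_numbers and last_odd are chosen for illustration only:

```python
# Collect odd numbers and stop at the first number larger than 6.
odd_numbers = []
for i in range(10):
    if i > 6:
        break       # Leave the loop entirely.
    if i % 2 == 0:
        continue    # Skip the rest of this iteration.
    odd_numbers.append(i)
    last_odd = i    # Assigned inside the loop body.

print(odd_numbers)
# The loop variable and last_odd are still defined after the loop.
print(i, last_odd)
```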

For-loops can have an else-block, which gets executed when the loop finishes normally, i.e. without encountering a break:

In [5]:
for i in [1, 2, 3]:
    if i == 0:
        break
else:
    print('No 0 found!')
No 0 found!
In [6]:
for i in [0, 1, 2, 3]:
    if i == 0:
        break
else:
    print('No 0 found!')

List comprehension¶

An easy way to create iterables from existing ones.

In [22]:
# List comprehension examples
l = [i**2 for i in range(10)]
l = [10. / i for i in range(10) if i > 0]
l = [10. / i if i > 0 else np.inf for i in range(10)]
l = [[a*i for a in [1, 2]] for i in range(1, 11)]
In [23]:
# The same works for dictionaries.
d = {"a": 2., "b": 3.}
d = {k: v**2 for k, v in d.items()}
In [24]:
# And of course sets
s = {i**2 for i in range(10)}
In [25]:
# Some generator magic
g = (i**2 for i in range(10))
r = sum(g)

# We can also do this in one line.
print(sum(i**2 for i in range(10)))
285

Working with files¶

In [26]:
# Create a text file and give it some input.
with open("example.txt", "w") as stream:
    stream.write("This is a line.\n")
    stream.write("This is another line.")
In [27]:
# Read the content of the text file.
with open("example.txt", "r") as stream:
    lines = stream.readlines()

print("".join(lines))
This is a line.
This is another line.

Random numbers¶

In [29]:
import random

# Initialize the random number generator with a seed.
# Use None as seed for using the current time.
random.seed(42)

# Return the next floating point number in the range [0, 1).
print(random.random())

# Return a uniformly distributed floating point number in the range [a, b].
a = 1
b = 6
print(random.uniform(a, b))

# See the NumPy section for how to generate a sequence of random numbers.
0.6394267984578837
1.1250537761133348

Functions¶

In [30]:
# A simple function that takes two arguments.
def power(a, e=2.):
    r"""Exponentiation

    Calculate: ``a**e``.

    Parameters
    ----------
    a : float
        Some number
    e : float, optional
        Exponent

    Returns
    -------
    float
        Result of ``a**e``

    """
    return a**e
In [31]:
print("5.**2 =", power(5.))
5.**2 = 25.0
In [32]:
# Alternative ways to call the function
results = [
    power(5., 2.),
    power(5., e=2.),
    power(a=5., e=2.),
    power(e=2., a=5.),
    power(*[5., 2.]),
    power(**{"a": 5., "e": 2.})
]
In [33]:
print("Results:", ", ".join("{}".format(r) for r in results))
Results: 25.0, 25.0, 25.0, 25.0, 25.0, 25.0
In [34]:
# General syntax
def func(*args, **kwargs):
    print(f'args: {args}')
    print(f'kwargs: {kwargs}')
In [35]:
func(3, 'a', a=2, b='1')
args: (3, 'a')
kwargs: {'a': 2, 'b': '1'}

A few words about docstrings:

  • Do write them because they help to better understand your code.
  • We recommend the NumPy style:
    https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html

Lambda functions¶

Lambdas are anonymous inline functions.

In [36]:
# Example for lambda usage
print(sorted("String"))
print(sorted("String", key=lambda s: s.lower()))
['S', 'g', 'i', 'n', 'r', 't']
['g', 'i', 'n', 'r', 'S', 't']
In [37]:
# A lambda function can be assigned to a variable.
square = lambda x: x**2
print("5**2 =", square(5.))
5**2 = 25.0

Exceptions¶

In [38]:
# Write a function that can raise an exception.
def division(a, b):
    r"""Divide `a` by `b`.

    Parameters
    ----------
    a, b : float
        Numerator and denominator

    Returns
    -------
    float
        The result of ``a/b``

    Raises
    ------
    ValueError
        If the denominator `b` is zero.
    """
    if np.fabs(b) > 0.:
        return a / b
    else:
        raise ValueError("The denominator should not be zero.")
In [39]:
# Catch the exception.
try:
    r = division(5., 0.)
except ValueError as e:
    print(e)
    r = np.inf
finally:
    print("The result is {}.".format(r))
The denominator should not be zero.
The result is inf.

A full list of built-in exceptions can be found here:
https://docs.python.org/3/library/exceptions.html

The most common exception types that one wants to raise are:

  • TypeError
  • ValueError
  • NameError
  • IndexError
  • KeyError
  • NotImplementedError
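
As a sketch of how these exception types are typically raised and handled, the following example uses a hypothetical helper function to_positive_float (not part of any library). It also shows the optional else-block of a try statement, which runs only if no exception occurred:

```python
def to_positive_float(value):
    """Convert `value` to a float and require it to be positive.

    Hypothetical helper, for illustration only.
    """
    if not isinstance(value, (int, float, str)):
        raise TypeError(f"Unsupported type: {type(value).__name__}")
    number = float(value)  # Raises ValueError for strings like "abc".
    if number <= 0.:
        raise ValueError("The value must be positive.")
    return number


messages = []
for candidate in ["2.5", -1, None, "abc"]:
    try:
        result = to_positive_float(candidate)
    except TypeError as e:
        messages.append(f"TypeError: {e}")
    except ValueError as e:
        messages.append(f"ValueError: {e}")
    else:
        # Runs only if no exception was raised in the try-block.
        messages.append(f"OK: {result}")

print("\n".join(messages))
```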

Decorators¶

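Decorators are functions that wrap other functions. The @decorator syntax is syntactic sugar for passing a function to the decorator and rebinding the function's name to the result.

  • Decorator functions take a callable as argument.
  • Decorator functions return a callable, which usually calls the given callable argument.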
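
A minimal sketch of these rules, assuming a hypothetical decorator bold and two hypothetical functions welcome and goodbye. It uses functools.wraps from the standard library so that the decorated function keeps its name and docstring:

```python
import functools


def bold(f):
    # The decorator takes the callable `f` and returns a new callable.
    @functools.wraps(f)  # Keep the name and docstring of `f`.
    def wrapper(*args, **kwargs):
        # The wrapper calls the original callable and modifies its result.
        return "<b>{}</b>".format(f(*args, **kwargs))
    return wrapper


@bold
def welcome(name):
    """Welcome a person by name."""
    return "Welcome, {}!".format(name)


def goodbye(name):
    return "Goodbye, {}!".format(name)


# The @-syntax is shorthand for this explicit re-assignment.
goodbye = bold(goodbye)

print(welcome("Jane"))
print(goodbye("Jane"))
print(welcome.__name__)
```

Accepting *args and **kwargs in the wrapper keeps the decorator usable for functions with any signature.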
In [40]:
# Example of a simple decorating function.
def paragraph(f):
    return lambda name: "<p>{}</p>".format(f(name))
In [41]:
@paragraph
def greet(name):
    return "Hello {}, how are you?".format(name)
In [42]:
print(greet("John"))
<p>Hello John, how are you?</p>

Nested functions¶

Python allows defining functions within functions.

  • Can be used to "configure" a function object.
  • The inner function has access to the scope of the outer function (a closure).
In [43]:
def preparing_to_feed_cat(what_to_feed):

    def feed_cat(how_much_to_feed):
        msg = f"I'm feeding my cat with {how_much_to_feed} {what_to_feed}."
        return msg

    return feed_cat

# Create a function object that knows what to feed the cat with.
feed_cat_func = preparing_to_feed_cat(what_to_feed='cans of meat')

print(feed_cat_func(how_much_to_feed=3))
I'm feeding my cat with 3 cans of meat.

PEP8 Coding style¶

Please stick to the 'official' style guide for Python code:
https://www.python.org/dev/peps/pep-0008/

Useful software tools:

  • flake8 is a popular linter for Python. It checks code against the PEP8 coding style and for programming errors.
  • black is "the uncompromising Python code formatter". It automatically reformats code into a PEP8-compliant style.

Common pitfalls¶

The Python language has some particular properties that might lead to some not-so-obvious pitfalls.

  • Objects are passed by reference (not by copy!)
    • immutable data types like int, float, complex cannot be changed in place, so they behave as if they were passed by copy
In [44]:
a = 42
def f(b):
    b = 11
f(a)
print(a)
42

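Inside the function, the name b is rebound to a new object; the int object that a refers to is not changed. The scalar variable therefore behaves as if it were passed as a copy.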

In [45]:
a = [42]
def f(b):
    c = b
    c.append(11)
f(a)
print(a)
[42, 11]

The list a is passed by reference. Hence, b refers to the same object in memory as a. The same applies when an object is assigned to another name: c refers to the same object as b.

  • Function keyword argument default values are instantiated only once
    • lists as keyword argument default values can be tricky
In [46]:
def f(a=[0]):
    # Append a new element with the value as
    # the maximum of all list values plus 1.
    a.append(max(a) + 1)
    return a
In [47]:
print(f())
[0, 1]
In [48]:
print(f())
[0, 1, 2]

The function changes the one-and-only list object that was instantiated as default keyword argument value.
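
A common way around this pitfall is to use None as the default value and create the list inside the function. This is a sketch of that idiom applied to the function f from above:

```python
def f(a=None):
    # Create a fresh list on every call if none was given.
    if a is None:
        a = [0]
    a.append(max(a) + 1)
    return a


print(f())
print(f())
print(f([5]))
```

Both calls without an argument now return a new list, because the default list is created at call time and not at definition time.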

NumPy¶

From the NumPy webpage:

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • tools for integrating C/C++ and Fortran code
  • useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

NumPy tutorial (pay special attention to the section about broadcasting rules):
https://docs.scipy.org/doc/numpy/user/quickstart.html

Importing numpy¶

In [ ]:
import numpy as np

Creating numpy arrays¶

Numpy arrays can be created from sequences such as lists and tuples via the numpy.array() function:

In [ ]:
a_list = [1, 2, 5, 3, 7, 10]  # A list.
arr = np.array(a_list)  # Instantiation of the numpy.ndarray class.
print(a_list*3)  # Repeat list 3 times.
print(arr*3)  # Multiply every array item with 3.
[1, 2, 5, 3, 7, 10, 1, 2, 5, 3, 7, 10, 1, 2, 5, 3, 7, 10]
[ 3  6 15  9 21 30]

The array's data type can be specified:

In [ ]:
arr = np.array(a_list, dtype=np.float64)
print(arr)
print(arr.dtype)
[ 1.  2.  5.  3.  7. 10.]
float64

Numpy arrays have a dimensionality and a shape:

In [ ]:
print(arr)
print(f'arr.ndim={arr.ndim}')
print(f'arr.shape={arr.shape}')
[ 1.  2.  5.  3.  7. 10.]
arr.ndim=1
arr.shape=(6,)

Numpy arrays can be reshaped:

In [ ]:
arr = arr.reshape((3,2))
print(arr)
print(f'arr.ndim={arr.ndim}')
print(f'arr.shape={arr.shape}')
[[ 1.  2.]
 [ 5.  3.]
 [ 7. 10.]]
arr.ndim=2
arr.shape=(3, 2)

Math functions¶

Numpy provides many math functions that operate element-wise on an entire array:

In [23]:
arr = np.arange(3)
print(f'arr = {arr}')
print(f'np.sin(arr) = {np.sin(arr)}')
print(f'np.exp(arr) = {np.exp(arr)}')
print(f'np.power(10, arr) = {np.power(10, arr)}')
arr = [0 1 2]
np.sin(arr) = [0.         0.84147098 0.90929743]
np.exp(arr) = [1.         2.71828183 7.3890561 ]
np.power(10, arr) = [  1  10 100]

Statistical functions¶

Numpy provides a few useful statistical functions:

In [29]:
arr = np.arange(10)
print(f'arr = {arr}')
print(f'np.mean(arr) = {np.mean(arr)}')
print(f'np.median(arr) = {np.median(arr)}')
print(f'np.std(arr) = {np.std(arr)}')
arr = [0 1 2 3 4 5 6 7 8 9]
np.mean(arr) = 4.5
np.median(arr) = 4.5
np.std(arr) = 2.8722813232690143

Sorting functions¶

The numpy.sort function supports several sorting algorithms via its kind argument: 'quicksort' (the default), 'heapsort', 'mergesort', and 'stable'.

In [24]:
unsorted_arr = np.array([3, 4, 2, 1, 6, 5])
sorted_arr = np.sort(unsorted_arr, kind='heapsort')
print(f'sorted_arr = {sorted_arr}')
sorted_arr = [1 2 3 4 5 6]
In [25]:
indices_ordering = np.argsort(unsorted_arr)
print(f'indices_ordering = {indices_ordering}')
indices_ordering = [3 2 0 1 5 4]

Broadcasting of numpy arrays¶

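Two numpy arrays of different dimensionality can be combined in an expression through broadcasting: dimensions of length 1 are prepended to the shape of the lower-dimensional array until both arrays have the same number of dimensions. Axes of length 1 are then stretched to the length of the corresponding axis of the other array, which allows element-wise operations: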

In [ ]:
import numpy as np

a = np.array([[1,2,3], [4,5,6]])
b = np.array([7,8,9])
c = a + b
print('shape of a:', a.shape)
print('shape of b:', b.shape)
print('shape of c:', c.shape)
print(c)
shape of a: (2, 3)
shape of b: (3,)
shape of c: (2, 3)
[[ 8 10 12]
 [11 13 15]]
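
Because new dimensions are prepended, a 1D array is always matched against the last axis. To add one value per row instead, an axis of length 1 has to be appended explicitly. This is a small sketch; the array col is illustrative only:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
col = np.array([10, 20])               # shape (2,)

# a + col would raise a ValueError: the shapes (2, 3) and (2,) do not
# match on the last axis. Appending an axis of length 1 makes a column.
col = col[:, np.newaxis]               # shape (2, 1)
c = a + col

print('shape of col:', col.shape)
print('shape of c:', c.shape)
print(c)
```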

Indexing / Slicing of numpy arrays¶

Elements of a numpy array can be accessed via indexing or slicing:

In [ ]:
# Access by index (indices start with 0!):
print(a[0,2])
print(a[ [0,1,1], [0,1,2] ])
3
[1 5 6]
In [ ]:
# Access by slice: start:end:step:
# Note: The start index is included, but the end index is excluded!
a = np.arange(0, 10)
print(a)
print(a[3:7])
print(a[3:7:2])
# Step can be negative to reverse the slice.
print(a[7:3:-2])
[0 1 2 3 4 5 6 7 8 9]
[3 4 5 6]
[3 5]
[7 5]

Masking of numpy arrays¶

A mask for an array is a boolean array selecting elements of an array.

In [ ]:
a = np.arange(0, 5)
mask = np.array([True, False, False, True, True])
print(a)
print(a[mask])
[0 1 2 3 4]
[0 3 4]
In [ ]:
mask = (a == 0) | (a > 2)
print(mask)
print(a[mask])
[ True False False  True  True]
[0 3 4]

Numpy structured arrays¶

Numpy supports arrays with field names, so-called structured arrays.

  • Useful for representing a data table.
  • Different columns (fields) can have different data types.
  • All columns must have the same number of entries.
  • Structured arrays are defined via a structured data type object.
  • Structured arrays that represent a data table are one-dimensional: each element is one table row.
In [ ]:
dt = [('a', np.int8), ('b', bool), ('c', np.float32)]
arr = np.array([(3, True, 3.3), (2, False, 2.2), (4, False, 4.4)], dtype=dt)
In [ ]:
print('arr.size =', arr.size)
print('arr.ndim =', arr.ndim)
# Retrieve a table row.
print('arr[1] =', arr[1])
arr.size = 3
arr.ndim = 1
arr[1] = (2, False, 2.2)
In [ ]:
# Masking on columns.
mask = arr['a'] >= 3
print('mask =', mask)
print('arr[mask] =', arr[mask])
mask = [ True False  True]
arr[mask] = [(3,  True, 3.3) (4, False, 4.4)]
In [ ]:
# Setting column content.
arr['a'] = [42, 41, 40]
print(repr(arr['a']))
array([42, 41, 40], dtype=int8)
In [ ]:
# Setting a table row. A tuple has to be used!
arr[1] = (7, True, 7.7)
print(arr)
[(42,  True, 3.3) ( 7,  True, 7.7) (40, False, 4.4)]

Random numbers with numpy¶

See numpy.org/doc/stable/reference/random/index.html for documentation.

In [ ]:
# Create a random number generator with a seed.
rng = np.random.default_rng(42)

# Generate next random number(s) in range [0,1).
print(rng.random(size=5))

# Generate a numpy array with uniform random numbers in the range [1, 6).
a, b = 1, 6
rng.uniform(a, b, size=(5,2))
[0.77395605 0.43887844 0.85859792 0.69736803 0.09417735]
array([[5.87811176, 4.80569851],
       [4.93032153, 1.64056816],
       [3.25192969, 2.85399012],
       [5.63382494, 4.2193256 ],
       [5.11380807, 3.21707099]])

Pandas¶

The pandas package provides user-friendly record arrays as DataFrame objects. 1D arrays are represented by Series objects.

  • See pandas.pydata.org
  • Table rows can be indexed for faster access.
  • See The 10 min guide (pandas.pydata.org/docs/user_guide/10min.html)
In [9]:
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]})
print(df)
   A
0  1
1  2
2  3
In [10]:
arr = pd.Series([1, 2, 3])
print(arr)
0    1
1    2
2    3
dtype: int64
In [11]:
# Show data types:
df.dtypes
Out[11]:
A    int64
dtype: object
In [12]:
# describe() shows a quick statistic summary of your data:
df.describe()
Out[12]:
         A
count  3.0
mean   2.0
std    1.0
min    1.0
25%    1.5
50%    2.0
75%    2.5
max    3.0
In [13]:
# Slicing rows.
df[0:2]
Out[13]:
   A
0  1
1  2

Scipy¶

Scipy is a package providing functionality for scientific calculations like:

  • numerical integration
  • minimization / curve fitting
  • interpolation, e.g. splines
  • statistical functions, e.g. various probability distributions
  • ...

Scipy Documentation

Numerical integration¶

scipy.integrate.quad(func, a, b[, args, full_output, ...]) computes the definite integral of the function func from a to b.
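
A minimal usage sketch. The gaussian function is defined here for illustration only; quad returns the value of the integral together with an estimate of the absolute error:

```python
import numpy as np
from scipy import integrate

# Integrate sin(x) from 0 to pi; the exact result is 2.
value, abs_error = integrate.quad(np.sin, 0., np.pi)
print(f"integral = {value:.6f} +/- {abs_error:.1e}")


# Additional function arguments are passed via `args`.
def gaussian(x, mu, sigma):
    norm = sigma * np.sqrt(2. * np.pi)
    return np.exp(-0.5 * ((x - mu) / sigma)**2) / norm


# Infinite integration limits are given as +/- np.inf.
total, _ = integrate.quad(gaussian, -np.inf, np.inf, args=(0., 1.))
print(f"total = {total:.6f}")
```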

Minimization / curve fitting¶

The scipy.optimize module provides various functions for optimization purposes, like minimization and curve fitting.

  • scipy.optimize.minimize minimizes a function using various algorithms.

  • scipy.optimize.curve_fit(f, xdata, ydata[, p0, sigma, ...]) fits a function to data.
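
The following sketch applies both functions to synthetic data. The model function line, the noise level, and the start values are assumptions made for this example:

```python
import numpy as np
from scipy import optimize


def line(x, slope, intercept):
    return slope * x + intercept


# Synthetic data (illustrative only): a line with small Gaussian noise.
rng = np.random.default_rng(42)
xdata = np.linspace(0., 10., 50)
ydata = line(xdata, 2., 1.) + rng.normal(scale=0.1, size=xdata.size)

# curve_fit returns the best-fit parameters and their covariance matrix.
popt, pcov = optimize.curve_fit(line, xdata, ydata, p0=[1., 0.])
perr = np.sqrt(np.diag(pcov))
print(f"slope     = {popt[0]:.2f} +/- {perr[0]:.2f}")
print(f"intercept = {popt[1]:.2f} +/- {perr[1]:.2f}")

# minimize expects a function of a parameter vector and a start value.
res = optimize.minimize(lambda p: (p[0] - 3.)**2 + 1., x0=[0.])
print(f"minimum at x = {res.x[0]:.3f}")
```

The square roots of the diagonal elements of the covariance matrix serve as one-sigma uncertainty estimates of the fitted parameters.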

Interpolation¶

The scipy.interpolate module provides various interpolation classes and functions, for example B-spline representations of data points.
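
As a sketch, scipy.interpolate.make_interp_spline constructs an interpolating B-spline through given data points. The sampled sine function is just example input:

```python
import numpy as np
from scipy import interpolate

# Sample the sine function at a few support points.
x = np.linspace(0., 2. * np.pi, 10)
y = np.sin(x)

# Build a cubic (k=3) B-spline that passes through all data points.
spline = interpolate.make_interp_spline(x, y, k=3)

# The spline object is callable and can be evaluated on a finer grid.
x_fine = np.linspace(0., 2. * np.pi, 101)
max_deviation = np.max(np.abs(spline(x_fine) - np.sin(x_fine)))
print(f"maximum deviation from sin(x): {max_deviation:.1e}")
```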

Statistical functions¶

The scipy.stats module provides various functions for statistical quantities and probability distributions.
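
A brief sketch using the normal distribution scipy.stats.norm. The chosen parameters and the sample size are arbitrary:

```python
from scipy import stats

# A normal distribution with mean 0 and standard deviation 1.
dist = stats.norm(loc=0., scale=1.)

print(f"pdf(0)    = {dist.pdf(0.):.4f}")
print(f"cdf(1.96) = {dist.cdf(1.96):.4f}")
# ppf (percent point function) is the inverse of the cdf.
print(f"ppf(0.5)  = {dist.ppf(0.5):.4f}")

# Draw reproducible random samples from the distribution.
samples = dist.rvs(size=1000, random_state=42)
print(f"sample mean = {samples.mean():.2f}")
```

Other distributions in scipy.stats, e.g. stats.poisson or stats.expon, follow the same interface.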

Matplotlib¶

From the Matplotlib webpage:

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.

In [49]:
# Call this magic function in the first line of your Jupyter notebook:
#%matplotlib notebook
# If that doesn't work, use
#%matplotlib widget
In [50]:
# Example: the final plot looks like this:
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker
fig = plt.figure(figsize=(6, 3))
In [51]:
# Split the figure into an array of fields: ncols x nrows.
grid = matplotlib.pyplot.GridSpec(ncols=1, nrows=1)
In [52]:
# Add a subplot by specifying column and row.
# You can also use slicing here for combining fields.
ax = fig.add_subplot(grid[0, 0])
In [53]:
# Input data
xval = np.linspace(0., 2.*np.pi, 101)
yval = np.sin(xval)
In [54]:
# Plot the input data:
# Combine the data points with a solid line.
ax.plot(xval, yval, "-", label="This is the sine function.")
Out[54]:
[<matplotlib.lines.Line2D at 0x7f506aa4f7f0>]
In [55]:
# Axes formatting
ax.set_xlim(xval[0], xval[-1])
ax.set_ylim(-1., 1.)
Out[55]:
(-1.0, 1.0)
In [56]:
# Set the major ticks location and format of the x-axis.
ax.xaxis.set_major_locator(matplotlib.ticker.FixedLocator([
    0, np.pi/2, np.pi, 3/2*np.pi, 2*np.pi]))
ax.xaxis.set_major_formatter(matplotlib.ticker.FixedFormatter([
    "$0$", r"$\frac{\pi}{2}$", r"$\pi$", r"$\frac{3\pi}{2}$",
    r"$2\pi$"]))
In [57]:
# For linear ticks, Matplotlib is pretty good at figuring out the
# location of minor ticks.
ax.yaxis.set_major_locator(
    matplotlib.ticker.LinearLocator(numticks=5))
ax.yaxis.set_minor_locator(
    matplotlib.ticker.AutoMinorLocator())
In [58]:
# Labels
ax.set_xlabel(r"$\alpha$")
ax.set_ylabel(r"$\sin(\alpha)$")
Out[58]:
Text(26.097222222222214, 0.5, '$\\sin(\\alpha)$')
In [59]:
# Legend
ax.legend(loc="upper right")
Out[59]:
<matplotlib.legend.Legend at 0x7f506aa18640>
In [60]:
# Shrink the axes in order to fit in the labels.
grid.tight_layout(fig)
plt.show()
In [61]:
# Save the figure.
fig.savefig("example.pdf")

Skymaps with Matplotlib¶

  • Skymaps are just special projections of a plot.
In [63]:
fig = plt.figure(figsize=(6, 3))
ax = fig.add_subplot(projection="mollweide")
plt.title("Mollweide Projection")
plt.grid(True)
plt.show()
  • The coordinate system is longitude ($-\pi$ to $+\pi$) and latitude ($-\pi/2$ to $+\pi/2$).
  • The coordinates are specified in radians.
  • Use the usual plot methods of the Axes instance, like plot for plotting a point or line.
In [64]:
# Draw a line.
fig = plt.figure(figsize=(6, 3))
ax = fig.add_subplot(projection="mollweide")
plt.title("Mollweide Projection")
plt.grid(True)
line = ax.plot(np.deg2rad([-90, 120]), np.deg2rad([-45, 60]))
plt.tight_layout()
plt.show()

Plotting a map from array data¶

  • Given a sky map as a 2D numpy array, one can use pcolormesh to plot it.
In [65]:
# Create the data as a 2D numpy ndarray.
data = np.zeros((128, 64), dtype=np.float64)
data[54:74, 22:42] = np.reshape(np.random.normal(size=20*20), (20,20))
# Create mesh grid for the coordinates of the data bins.
lon = np.linspace(-np.pi, np.pi, data.shape[0])
lat = np.linspace(-np.pi/2, np.pi/2, data.shape[1])
(Lon, Lat) = np.meshgrid(lon, lat)
In [66]:
fig = plt.figure(figsize=(6,3))
ax = fig.add_subplot(projection="mollweide")
im = ax.pcolormesh(Lon, Lat, data.T, cmap=plt.cm.seismic,
                   vmin=-4, vmax=4)
plt.colorbar(im, label='random normal values')
plt.grid(True)
  • More user-friendly skymap plotting functions are provided by the healpy package: healpy.readthedocs.io/en/latest/newvisufunc_example.html

Creating a matplotlib animation¶

Matplotlib has the module matplotlib.animation which provides functionality to create animated plots.

  • After a plot is drawn, an animation function can be defined that updates the artists of the plot.
  • The matplotlib.animation.FuncAnimation class can be used to animate a function of the form func(frame_number).
  • See matplotlib.org/stable/api/_as_gen/matplotlib.animation.FuncAnimation.html

Let's animate a point within a plot. The point will move forward on the x-axis.

In [ ]:
# Import matplotlib and configure it to show animations in Jupyter.
from matplotlib import pyplot as plt
from matplotlib import animation

# To visualize animations correctly, we use the jshtml backend.
from matplotlib import rc
rc('animation', html='jshtml')
In [ ]:
# Define functions to plot a point and to prepare the animation.
def draw_point():
    # Plot a blue point at (0,0). The plt.plot function returns
    # a list of Line2D objects.
    point = plt.plot(0, 0, marker='o', color='blue')[0]
    return point

def prepare_animation():
    point = draw_point()

    def animate(frame_number):
        # In each frame we update the x-coordinate of the
        # Line2D object.
        point.set_xdata([frame_number])
        # The animate function needs to return a list of
        # artists that will be redrawn.
        return [point]

    return animate
In [ ]:
(fig, ax) = plt.subplots()
plt.title('Simple Animation')
plt.xlim(-1, 300)

# Define how long in seconds the animation should run.
length = 10
# Define how many frames per second we want to animate.
fps = 25
# Calculate the delay between frames in milliseconds.
# Keep in mind that we have a frame at the start AND
# at the end of the animation.
delay = int(1000 / (fps-1))

# Prepare the animation by plotting the point.
animation_func = prepare_animation()

# Animate the animation function.
ani = animation.FuncAnimation(
    fig,
    animation_func,
    frames=int(fps * length),  # Number of frames to animate.
    interval=delay,  # Delay between frames in milliseconds.
    repeat=False,
    blit=True  # This will use fewer resources.
)
In [ ]:
ani
In [ ]:
# To convert the animation into a HTML5 video one can do:
from IPython.display import HTML
HTML(ani.to_html5_video())

Celestial Coordinate transformations¶

  • Commonly used (spherical) coordinate systems:

    • horizontal local coordinate system (azimuth / altitude)
    • equatorial system (right ascension $\alpha$ / declination $\delta$)
    • galactic system (galactic longitude $l$ / galactic latitude $b$)
  • The astropy.coordinates module provides classes for coordinate transformations.

  • Documentation: docs.astropy.org/en/stable/coordinates

In [67]:
from astropy import units as u
from astropy.time import Time
from astropy.coordinates import SkyCoord, AltAz, EarthLocation
In [68]:
# Define location in sky.
TXS = SkyCoord(ra=77.358*u.degree, dec=5.693*u.degree, frame='icrs')
In [69]:
# Transformation into galactic coordinates.
print(TXS.galactic)
<SkyCoord (Galactic): (l, b) in deg
    (195.40542828, -19.63626563)>
In [70]:
# Transform to local horizontal system at South Pole.
southpole = EarthLocation.from_geodetic(
    lon=0*u.degree, lat=-90*u.degree)
TXS_local = TXS.transform_to(AltAz(
    obstime=Time(56000, format='mjd'),
    location=southpole))
print(TXS_local)
<SkyCoord (AltAz: obstime=56000.0, location=(3.91862092e-10, 0., -6356752.31424518) m, pressure=0.0 hPa, temperature=0.0 deg_C, relative_humidity=0.0, obswl=1.0 micron): (az, alt) in deg
    (265.51023729, -5.70561117)>
In [71]:
# Transform to local horizontal system at Garching (CIP-Pool1) now.
import time
garching = EarthLocation.from_geodetic(
    lat=48.26695746469105*u.degree, lon=11.67676875270207*u.degree)
TXS_local = TXS.transform_to(AltAz(
    obstime=Time(time.time(), format='unix'),
    location=garching))
print(TXS_local)
<SkyCoord (AltAz: obstime=1683273201.4854765, location=(4165583.55669134, 860889.61251078, 4736687.19293947) m, pressure=0.0 hPa, temperature=0.0 deg_C, relative_humidity=0.0, obswl=1.0 micron): (az, alt) in deg
    (90.11878672, 7.7833946)>

Object-Oriented Programming (OOP)¶

From Wikipedia:

Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can
contain data and code: data in the form of fields (often known as attributes or properties), and code,
in the form of procedures (often known as methods).

In contrast to procedural programming with global state, object-oriented programming allows encapsulating related data and functionality in classes.

Classes can be instantiated. The resulting instances are objects of that particular class.

Example:

Given class Animal, the object cat and object dog could be instances of class Animal.

The class Animal can have data fields, i.e. properties, like name and color.

Classes¶

  • A class is an object in Python, i.e. a class object.
  • A class always has a constructor.
  • A class can have attributes.
    • In Python everything is accessible, i.e. public.
    • Convention: Private attributes start with an underscore (_).
  • A class can have instance methods (functions of the class available to an instance of the class).
  • A class can have class methods (functions of the class available to the class object and an instance of the class).
  • A class can have special methods like __str__.
  • Classes can be derived from other classes (one or many) (see inheritance).
In [72]:
# A not very useful class.
class MyClass(object):
    pass
In [73]:
# MyClass is a class object!
# Let's look at its attributes:
dir(MyClass)
Out[73]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__']
In [74]:
# A simple class with a constructor, two attributes, and a
# special method.
class Direction(object):
    r"""Directional vector in spherical coordinates.

    Attributes
    ----------
    azimuth : float
        Azimuth angle in rad
    zenith : float
        Zenith angle in rad
    """
    def __init__(self, azimuth, zenith):
        """This is the constructor method.
        """
        self.azimuth = azimuth
        self.zenith = zenith

    def __str__(self):
        """This is a special instance method to generate
        output for `str(obj)`.
        """
        return f"({self.azimuth}, {self.zenith})"
In [75]:
d = Direction(np.pi/3., np.pi/2.)
print("Direction:", d)
Direction: (1.0471975511965976, 1.5707963267948966)

Properties¶

  • Properties tie data attributes to getter, setter, and deleter methods.
    • This allows controlling how a data attribute is read, changed, and deleted, e.g. validating a new value before it is stored.
In [76]:
class A(object):
    r"""Class `A` with the property `a`,
    which must be a positive number.
    """
    def __init__(self):
        self._a = 0

    @property
    def a(self):
        r"""int: A positive number
        """
        return self._a

    @a.setter
    def a(self, val):
        if val >= 0:
            self._a = val
        else:
            raise ValueError(
                "Expect a positive number.")
In [77]:
a = A()
a.a = 42
print(a.a)
42
In [78]:
a.a = -1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-78-9af0c80ce669> in <module>
----> 1 a.a = -1

<ipython-input-76-7eed1312c01a> in a(self, val)
     17             self._a = val
     18         else:
---> 19             raise ValueError(
     20                 "Expect a positive number.")

ValueError: Expect a positive number.
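
The deleter method mentioned above is not part of the example. A minimal sketch of a property with getter, setter, and deleter could look as follows; the class `B` and the choice to reset the value on deletion are assumptions made for illustration:

```python
class B:
    """Hypothetical class `B` with a property `b` that supports
    getting, setting, and deleting."""
    def __init__(self):
        self._b = 0

    @property
    def b(self):
        return self._b

    @b.setter
    def b(self, val):
        self._b = val

    @b.deleter
    def b(self):
        # Called by `del obj.b`; here we reset to the default value.
        self._b = 0

obj = B()
obj.b = 42
print(obj.b)
del obj.b
print(obj.b)
```

After `del obj.b` the property still exists, because the deleter only resets the underlying `_b` attribute.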

Instance methods¶

Instance methods receive a reference to the class instance as their first argument, by convention named self. The constructor method __init__ is therefore an instance method as well.

In [79]:
class A(object):
    r"""A class with an instance method.
    """
    def __init__(self, a):
        self.a = a

a = A(42)
print(a.a)
42

The @classmethod decorator¶

The @classmethod decorator defines class methods. A class method receives a reference to the class object as its first argument, by convention named cls, instead of a reference to a class instance. This allows, for example, implementing more than one constructor for a class.

In [80]:
class A(object):
    r"""A class with class method.

    """
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __str__(self):
        return "a = {o.a}, b = {o.b}".format(o=self)

    @classmethod
    def from_tuple(cls, t):
        return cls(t[0], t[1])
In [81]:
a = A.from_tuple((1., 2.))
print(a)
a = 1.0, b = 2.0

The @staticmethod decorator¶

The @staticmethod decorator defines static methods. Static methods receive no implicit first argument, neither the class instance nor the class object, and hence can have no arguments at all. Static methods live inside the namespace of a class object. They can be called either from the class object or from an instance of the class.

In [82]:
class Angle(object):
    r"""A class with a static method

    """
    def __init__(self, value=0.):
        self.value = value

    @staticmethod
    def deg2rad(angle):
        return angle * np.pi / 180.
In [83]:
print("30deg = {:.2f}rad".format(Angle.deg2rad(30.)))
30deg = 0.52rad
In [84]:
a = Angle()
print("30deg = {:.2f}rad".format(a.deg2rad(30.)))
30deg = 0.52rad

Operator methods¶

  • Python classes can define magic methods to implement operators, like +, -, *, etc.

  • See emulating numeric types for a full list of possible operators.

  • Binary arithmetic operators:

    • +: __add__(self, rhs)
    • -: __sub__(self, rhs)
    • *: __mul__(self, rhs)
    • /: __truediv__(self, rhs)
    • @: __matmul__(self, rhs)
    • %: __mod__(self, rhs)
    • &: __and__(self, rhs)
    • |: __or__(self, rhs)
  • Reflected (swapped) binary operators use __r<op>__(self, lhs), e.g. __radd__(self, lhs).

In [85]:
# Example for addition operator:
class Text(object):
    def __init__(self, value):
        self.value = value
    def __add__(self, rhs):
        return Text(self.value + str(rhs))
    def __str__(self):
        return self.value

t1 = Text('Hello')
t2 = Text(' ')
t3 = Text('World')

print(t1 + t2 + t3)
print(t1 + ' ' + t3)
print('Hello' + t2 + t3)
Hello World
Hello World
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-85-4a6eeadbd374> in <module>
     14 print(t1 + t2 + t3)
     15 print(t1 + ' ' + t3)
---> 16 print('Hello' + t2 + t3)

TypeError: can only concatenate str (not "Text") to str

The last expression print('Hello' + t2 + t3) can be supported via a reflected add operator:

In [86]:
# Example for addition operator with reflected addition operator:
class Text(object):
    def __init__(self, value):
        self.value = value
    def __add__(self, rhs):
        return Text(self.value + str(rhs))
    def __radd__(self, lhs):
        return Text(str(lhs) + self.value)
    def __str__(self):
        return self.value

t2 = Text(' ')
t3 = Text('World')

print('Hello' + t2 + t3)
Hello World
  • Rich comparison operators and their corresponding magic class methods:
    • >: __gt__(self, rhs)
    • >=: __ge__(self, rhs)
    • <: __lt__(self, rhs)
    • <=: __le__(self, rhs)
    • ==: __eq__(self, rhs)
    • !=: __ne__(self, rhs)
    • Should return True or False.
  • Augmented arithmetic assignments (in-place operators): __i<op>__(self, rhs), e.g. __iadd__(self, rhs).
    • +=: __iadd__(self, rhs)
    • -=: __isub__(self, rhs)
    • *=: __imul__(self, rhs)
    • Should return the object itself.
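
The comparison and in-place operators listed above can be sketched as follows. The class `Energy` and its unit are hypothetical and only serve as an illustration:

```python
class Energy:
    """Hypothetical class holding an energy value in GeV."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, rhs):
        return self.value == rhs.value

    def __lt__(self, rhs):
        return self.value < rhs.value

    def __iadd__(self, rhs):
        # Modify the object in place and return the object itself.
        self.value += rhs.value
        return self

    def __str__(self):
        return f'{self.value} GeV'

e1 = Energy(1.5)
e2 = Energy(2.5)
print(e1 < e2)
print(e1 == e2)

e1_id = id(e1)
e1 += e2
print(e1)
# The in-place operator did not create a new object.
print(id(e1) == e1_id)
```

Because `__iadd__` returns `self`, the name `e1` still refers to the same object after `e1 += e2`.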

Check for attributes¶

The Python built-in function hasattr(obj, name) can be used to check if an object has an attribute of a given name.

In [87]:
angle = Angle(42)
print(hasattr(angle, 'deg2rad'))
True

Proper naming scheme¶

The PEP8 Style Guide defines a proper naming scheme for classes, method names and properties.

  • Classes use the CapWords convention, e.g. MyClass.
  • Methods use lowercase, with words separated by underscores, e.g. print_a_number.
  • "Private" attributes start with an underscore (_), e.g. _a_private_number.
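
A short sketch applying these three conventions; the class `ParticleTrack` and its members are hypothetical names chosen for illustration:

```python
class ParticleTrack:                 # class name: CapWords
    def __init__(self, length):
        self._length = length        # "private" attribute: leading underscore

    def get_length_in_cm(self):      # method name: lowercase with underscores
        # Assumes that the length was given in meters.
        return self._length * 100.

track = ParticleTrack(2.)
print(track.get_length_in_cm())
```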

Inheritance¶

Inheritance allows existing classes to be extended in OOP. Classes can be derived from base classes and inherit their properties and methods. This way code can be reused and development time can be saved.

  • Methods can be overridden by derived classes.
  • The super() function can be used to call methods of the parent class.
In [88]:
class Animal(object):
    """This is the base class.
    """
    def __init__(self, species, age):
        self.species = species
        self.age = age

    # Instance method.
    def description(self):
        print(f'{self.species} is {self.age} years old')

    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            raise ValueError(
                f'Provided {age} age has to be a positive number.')
In [89]:
class Dog(Animal):
    """The Dog class is derived from the base class `Animal`.
    """
    def __init__(self, age):
        # Call the constructor of the parent class.
        # This initializes the `Animal` part of the new object.
        super().__init__('dog', age)

class Cat(Animal):
    def __init__(self, age):
        super().__init__('cat', age)

    def description(self):
        """This method overrides the parent's description method.
        """
        print(f'I am a {self.species}')
In [90]:
dog = Dog(5)
dog.description()
dog is 5 years old
In [91]:
cat = Cat(6)
cat.description()
I am a cat
In [92]:
print(type(cat))
print(type(cat.age))
print(type(cat.description))
<class '__main__.Cat'>
<class 'int'>
<class 'method'>

isinstance and issubclass¶

To check if an instance is of some class we can use the built-in function isinstance() which takes two arguments, an instance object and a class object and returns True if the given class is anywhere in the inheritance chain of the instance's class:

In [93]:
print(isinstance(cat, Cat))
print(isinstance(cat, Dog))
print(isinstance(cat, Animal))
True
False
True

We can also check if a given class is a subclass of another class with the built-in function issubclass().

In [94]:
print(issubclass(Cat, Animal))
print(issubclass(Cat, Cat))
print(issubclass(Cat, Dog))
True
True
False

Multiple inheritance¶

  • A class can be derived from more than one base class.
  • The parent class constructor should always be called.
  • Additional arguments *args and keyword arguments **kwargs need to be passed on to the parent.
In [95]:
class IsPuppy(object):
    def __init__(self, *args, **kwargs):
        # We need to call the parent's constructor
        # to call the constructor of ALL base classes of a derived class.
        super().__init__(*args, **kwargs)

    def description(self):
        print('I am a puppy.')

class Puppy(Dog, IsPuppy):
    def __init__(self, age):
        super().__init__(age)
In [96]:
puppy_dog = Puppy(1)
puppy_dog.description()
dog is 1 years old

Python determines which description to call using Method Resolution Order (MRO). We can check the order using the mro() method:

In [97]:
print(Puppy.mro())
[<class '__main__.Puppy'>, <class '__main__.Dog'>, <class '__main__.Animal'>, <class '__main__.IsPuppy'>, <class 'object'>]

Changing the order of the base classes changes the MRO, so that IsPuppy's description is called first:

In [98]:
class Puppy(IsPuppy, Dog):
    def __init__(self, age):
        super().__init__(age)
In [99]:
print(Puppy.mro())

puppy_dog = Puppy(1)
puppy_dog.description()
[<class '__main__.Puppy'>, <class '__main__.IsPuppy'>, <class '__main__.Dog'>, <class '__main__.Animal'>, <class 'object'>]
I am a puppy.
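
The following sketch illustrates that super() follows the MRO of the instance's class and not simply the direct base class. The classes `Base`, `Left`, `Right`, and `Child` are hypothetical and independent of the animal example:

```python
class Base:
    def describe(self):
        return ['Base']

class Left(Base):
    def describe(self):
        # super() refers to the next class in the MRO of the instance,
        # which is not necessarily the direct base class `Base`.
        return ['Left'] + super().describe()

class Right(Base):
    def describe(self):
        return ['Right'] + super().describe()

class Child(Left, Right):
    def describe(self):
        return ['Child'] + super().describe()

print([cls.__name__ for cls in Child.mro()])
print(Child().describe())
```

For a `Child` instance the call `super().describe()` inside `Left` reaches `Right`, because `Right` comes after `Left` in the MRO of `Child`.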

Abstract Base Classes¶

  • Abstract base classes can be used to define interfaces.
  • Methods can be declared abstract and must be implemented by derived classes before a class can be instantiated.
  • The abc package provides abstract base class functionality.
    • See https://docs.python.org/3/library/abc.html
  • Abstract base classes are defined by setting the class's metaclass keyword with metaclass=abc.ABCMeta.
  • Abstract methods (instance methods, class methods, properties) are defined with @abstractmethod decorator.
In [100]:
import abc

class AnimalBase(object, metaclass=abc.ABCMeta):
    def __init__(self, species, age):
        self.species = species
        self.age = age

    # Abstract instance method.
    @abc.abstractmethod
    def description(self):
        # We use `pass` as a placeholder for the
        # implementation by the derived class.
        pass

    # Define age property with abstract setter method.
    @property
    def age(self):
        return self._age
    @age.setter
    @abc.abstractmethod
    def age(self, age):
        pass

A class whose metaclass is abc.ABCMeta cannot be instantiated unless all of its abstract methods and properties are implemented.

In [101]:
class Dog(AnimalBase):
    def __init__(self, age):
        super().__init__('dog', age)
In [102]:
dog = Dog(5)
dog.description()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-102-8c735f9acb69> in <module>
----> 1 dog = Dog(5)
      2 dog.description()

TypeError: Can't instantiate abstract class Dog with abstract methods age, description

Let's fix it by implementing the description method and the setter method of the age property.

In [103]:
class Dog(AnimalBase):
    def __init__(self, age):
        super().__init__('dog', age)

    # Implement the description instance method.
    def description(self):
        print('I am the implementation of the description instance method.')
        print(f'The {self.species} is of age {self.age}.')

    # Implement the setter method of the age property.
    @AnimalBase.age.setter
    def age(self, age):
        if age < 0:
            print(f'Provided {age} age has to be positive number.')
            self._age = None
        else:
            self._age = age
In [104]:
dog = Dog(-5)
dog.description()
Provided -5 age has to be positive number.
I am the implementation of the description instance method.
The dog is of age None.
In [105]:
dog = Dog(2)
dog.description()
I am the implementation of the description instance method.
The dog is of age 2.
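
Abstract class methods are mentioned above but not shown. A possible sketch, using the helper class abc.ABC from the standard library (a class that has abc.ABCMeta as its metaclass); the classes `Shape` and `Square` are hypothetical:

```python
import abc

class Shape(abc.ABC):
    """Hypothetical interface; abc.ABC is a helper class that has
    abc.ABCMeta as its metaclass."""
    @classmethod
    @abc.abstractmethod
    def unit_shape(cls):
        """Abstract class method: create a shape of unit size."""

    @abc.abstractmethod
    def area(self):
        """Abstract instance method."""

class Square(Shape):
    def __init__(self, side):
        self.side = side

    @classmethod
    def unit_shape(cls):
        return cls(1.)

    def area(self):
        return self.side**2

print(Square.unit_shape().area())

# The interface itself cannot be instantiated.
try:
    Shape()
except TypeError as err:
    print('TypeError:', err)
```

Note that @abc.abstractmethod has to be the innermost decorator when it is combined with @classmethod.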

Iterator protocol¶

If a class implements a container, the iterator protocol can be used to iterate over the items of the container. See https://docs.python.org/3/library/stdtypes.html#iterator-types for details.

The iterator protocol consists of two special instance methods: __iter__ and __next__.

The __iter__ instance method must return an iterator object. An iterator returns itself, whereas a container usually returns a new iterator object.

The iter() built-in function can be used to get an iterator of an object.

The __next__ instance method must return the next element of the container. If it raises the StopIteration exception, the iteration stops.

In [106]:
# Example of a container with the iterator protocol supported.
class MyBox(object):
    def __init__(self, items):
        self.items = items
    def __iter__(self):
        return MyBoxIterator(self)

# We define an iterator class that knows how to iterate through
# the items of the box.
class MyBoxIterator(object):
    def __init__(self, box):
        self.box = box
        # The index attribute points to the next item in the box.
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        # Check if we reached the end of the item sequence.
        if self.index == len(self.box.items):
            raise StopIteration()
        self.index += 1
        return self.box.items[self.index-1]
In [107]:
box = MyBox([8, 'a', 42, 'hello'])
items = [item for item in box]
print(items)
[8, 'a', 42, 'hello']

Our box class uses a sequence for storing the items. Sequences have iterators already implemented in Python, hence we can simplify the previous example by utilizing the iterator of the item sequence using the iter() built-in function.

In [108]:
class MySimpleBox(object):
    def __init__(self, items):
        self.items = items
    def __iter__(self):
        return iter(self.items)
In [109]:
box = MySimpleBox([8, 'a', 42, 'hello'])
for item in box:
    print(item)
8
a
42
hello
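
What a for-loop does behind the scenes can be sketched by calling the iter() and next() built-in functions by hand. The list `numbers` is just an illustrative container:

```python
numbers = [1, 2]
it = iter(numbers)   # calls numbers.__iter__()

print(next(it))      # calls it.__next__()
print(next(it))

# The iterator is exhausted now, so next() raises StopIteration.
try:
    next(it)
except StopIteration:
    print('No more items.')

# An iterator returns itself from __iter__.
print(iter(it) is it)
```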

Context Manager¶

Context managers in Python allow you to allocate and release resources precisely when you want to. Context managers are used via the with statement; the most widely used example is opening a file.

In [110]:
with open('myfile.txt', 'w') as f:
    f.write('Hello World!')

This opens the file myfile.txt in write mode and enters a context. Once the block is finished, the context is exited. In this example the file will be closed automatically on exit.

Python implements context managers via two magic methods:

  • __enter__(self)
  • __exit__(self, type, value, traceback)
In [111]:
class MyFile(object):
    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode

    def __enter__(self):
        print(f'Opening file {self.filename} in mode {self.mode}')
        self.f = open(self.filename, self.mode)
        return self.f

    def __exit__(self, type, value, traceback):
        print(f'Closing file {self.filename}')
        self.f.close()

myfile = MyFile('myfile.txt', 'r')

with myfile as f:
    print(f.readlines())
Opening file myfile.txt in mode r
['Hello World!']
Closing file myfile.txt
  • The type, value, and traceback arguments of the __exit__ method allow the handling of exceptions.
  • If an exception is handled by the __exit__ method, the method needs to return True.
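
A minimal sketch of exception handling in __exit__, assuming a hypothetical context manager `IgnoreZeroDivision` that handles only one exception type:

```python
class IgnoreZeroDivision:
    """Hypothetical context manager that suppresses ZeroDivisionError."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None if the block finished without an exception.
        if exc_type is not None and issubclass(exc_type, ZeroDivisionError):
            print(f'Handled: {exc_value}')
            # Returning True tells Python that the exception was handled.
            return True
        # Any other exception is propagated to the caller.
        return False

with IgnoreZeroDivision():
    result = 1 / 0

print('Execution continues after the with block.')
```

Exceptions of any other type, e.g. a ValueError, still leave the with block as usual, because __exit__ returns False for them.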

Generators¶

Recap:¶

There are a couple of useful extra functions for loops, which produce their items on the fly (as generators do). The two most common ones are:

  • range(start (default=0), stop, stepsize (default=1)) -> creates a counting iterable, i.e. start, start + stepsize, ..., up to but excluding stop. For a large number of iterations, range is much more memory-efficient than looping over the corresponding list of [0, 1, 2, ...], because the items are generated on the fly.
  • enumerate(iterable) -> returns pairs of the index and the value of each item of an iterable

For more info on where generators are useful and how to write your own generator functions, see e.g. https://realpython.com/introduction-to-python-generators/
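
The behavior of range and enumerate described above can be checked with a short sketch. The number of items `n` is an arbitrary choice for illustration:

```python
import sys

# range generates its values on demand and therefore needs only a small
# amount of memory, independent of the number of values.
n = 1_000_000
print(sys.getsizeof(range(n)) < sys.getsizeof(list(range(n))))

# range(start, stop, stepsize): the stop value itself is excluded.
print(list(range(2, 10, 3)))

# enumerate yields (index, value) pairs.
for idx, name in enumerate(['a', 'b', 'c']):
    print(idx, name)
```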

Generator function¶

A generator function is a function which returns a generator iterator. It looks like a normal function except that it contains yield expressions for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function.

In [112]:
def custom_range_generator(n):
    i = 0
    while i < n:
        yield i
        i += 1
In [113]:
range_generator = custom_range_generator(5)
print("next():", next(range_generator))
print("type:", type(range_generator))
print("list remaining numbers:", list(range_generator))
next(): 0
type: <class 'generator'>
list remaining numbers: [1, 2, 3, 4]
In [114]:
# We have to create the generator again as we exhausted its items by calling
# the `list` function earlier.
range_generator = custom_range_generator(5)
# Example in for loops
for i in range_generator:
    print(i)
0
1
2
3
4
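
Because a generator produces its values one at a time, it can even describe a sequence without an end. A sketch, assuming a hypothetical generator function `fibonacci` and using itertools.islice from the standard library to take only the first values:

```python
import itertools

def fibonacci():
    """Hypothetical generator producing the Fibonacci sequence without end."""
    a, b = 0, 1
    while True:
        # The function pauses here and keeps its state until the
        # next value is requested.
        yield a
        a, b = b, a + b

# islice takes only the first 10 values of the infinite generator.
first = list(itertools.islice(fibonacci(), 10))
print(first)
```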

Working with version control systems (git)¶

What is a version control system?¶

Version Control

  • Allows keeping track of changes (versions) of files.
  • Tracks the history of a file.
  • Manages a repository, i.e. a collection of files, usually a directory tree.
  • Allows several people to collaborate on the same files.
    • Conflict resolution
  • Allows branching off from the main development path, i.e. creating branches.
    • Branches can be merged

What is git?¶

  • A version control system developed by the inventor of the Linux kernel, Linus Torvalds.
  • A decentralized system
    • repositories are stored locally on the user's computer
    • a central (remote) repository might exist, e.g. on github.com

Cloning a repository¶

To clone a repository use the git clone <repo_url>.git <local_dir> command:

git clone https://github.com/icecube/skyllh.git skyllh

This copies (clones) the repository including its history from the remote location to the local disk and checks out its default branch, e.g. master.

Working with branches¶

Git Branches

  • A branch represents an independent line of development.
  • Always use a new branch when making large code changes, e.g. new feature development or bug fix!

List available branches¶

To list all available branches, local and remote, type:

git branch -a

Creating a new branch¶

To work on a new feature or bug fix, one should create a new branch from the master branch.

  • Clone the repository.
  • Checkout a new branch via the git checkout -b <branch> command.
git clone https://github.com/icecube/skyllh.git skyllh.my_new_feature
cd skyllh.my_new_feature/
git checkout -b my_new_feature

After creating the new branch it lives in your local repository. In order to copy it to the remote repository one uses the git push --set-upstream origin <branch> command:

git push --set-upstream origin my_new_feature

To see which branch you are currently working on, use

git branch

to get the list of local branches, in which the current branch is marked with an asterisk (*).

Committing code changes¶

After making code changes, these changes must be committed to the branch.

To see the changed files:

git status

To see the changes of a file:

git diff <file>

To stage a changed file for a commit, use the git add command:

git add <file>

To commit all staged files:

git commit -m "My commit description"

Merging two branches¶

After a bug was fixed or a new feature was implemented the development branch needs to be merged with the master branch.

Merging two branches

  • Make sure to be in the correct receiving branch, e.g. master:
git checkout master
  • Fetch and pull the latest updates of the receiving branch:
git fetch
git pull
  • Merge the feature branch into the master branch:
git merge my_new_feature
  • Delete the feature branch:
git branch -d my_new_feature

Resolving merge conflicts¶

  • Merge conflicts occur when the same part of the same file was changed by both branches.
  • git merge will fail before creating the merge commit.
  • git status will tell where the merge conflicts are.
here is some content not affected by the conflict
<<<<<<< master
this is conflicted text from master
=======
this is conflicted text from my_new_feature branch
>>>>>>> my_new_feature
  • Edit the conflicting files to resolve the conflict.
  • Stage the changed files with git add.
  • Perform a final merge commit with git commit.

Pull requests on github.com¶

  • github.com provides the functionality to create pull requests.
  • A pull request asks for a branch to be merged (pulled) into another branch, e.g. the master branch.
  • Pull requests can be code-reviewed before merging.

Profiling your Python program¶

  • Profilers tell you where your Python program spends the most time and how much memory it uses.
  • Available profilers and their properties: List of profilers
  • A good and fast profiler is scalene: github.com/plasma-umass/scalene

Profiling with scalene¶

  • Installation via pip: pip install scalene
  • Works within Jupyter Notebook cells (but not in Google colab!):
In [1]:
%load_ext scalene
Scalene extension successfully loaded. Note: Scalene currently only
supports CPU+GPU profiling inside Jupyter notebooks. For full Scalene
profiling, use the command line version.
In [2]:
%%scalene
# Profile more than one line of code in a cell
x = 0
for i in range(1000):
    for j in range(1000):
        x += 1
  • Works on the command line:

    scalene --reduced-profile --- my_program.py --my_option value

  • Your program's command line comes after the three dashes (---)

  • Produces and displays a webpage with the profiling information

Unit Tests with Python¶

  • Python provides the unittest package implementing a unit test framework.
  • See docs.python.org/3/library/unittest.html for documentation.
  • Unit tests should test small pieces of software, e.g. functions or class methods, for their correct behavior.
  • Write a unit test after fixing a bug to make sure the bug is not re-introduced in the future!
In [138]:
import unittest

class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()
  • Test cases are derived classes from the unittest.TestCase base class.
  • Tests are implemented via class instance methods with names starting with test.
  • Special assert methods are used to ensure correct variable values.
  • Numpy provides the numpy.testing module with useful assert functions for testing numpy arrays.
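
The points above can be sketched with a small test case for a function of our own. The function `deg2rad` is a hypothetical stand-alone version of the Angle.deg2rad static method shown earlier, and the test is run explicitly via a test loader and runner, which is one possible way to execute tests from within a notebook cell:

```python
import math
import unittest

def deg2rad(angle):
    """Hypothetical function under test: converts degrees to radians."""
    return angle * math.pi / 180.

class TestDeg2Rad(unittest.TestCase):
    def test_known_values(self):
        # Floating point results are compared approximately.
        self.assertAlmostEqual(deg2rad(180.), math.pi)
        self.assertAlmostEqual(deg2rad(30.), math.pi / 6.)

    def test_zero(self):
        self.assertEqual(deg2rad(0.), 0.)

# Collect and run the tests of this test case explicitly.
suite = unittest.TestLoader().loadTestsFromTestCase(TestDeg2Rad)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.wasSuccessful())
```

Using assertAlmostEqual instead of assertEqual avoids test failures caused by floating point rounding.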

About this document¶

This notebook can be found on github.com:

https://github.com/martwo/teaching/tree/master/WS_2022_23/advanced_python

and can be cloned via:

git clone https://github.com/martwo/teaching.git

Converting this notebook into slides and launching a browser window opening them:

jupyter nbconvert advanced_python.ipynb --to slides --post serve